SimCompass: Using Deep Learning Word Embeddings to Assess Cross-level Similarity
Authors
Abstract
This article presents the system our team submitted to SemEval-2014 Task 3. Using a meta-learning framework, we experiment with traditional knowledge-based metrics, as well as novel corpus-based measures based on deep learning paradigms, paired with varying degrees of context expansion. The framework enabled us to reach the highest overall performance among all competing systems.
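As a rough illustration of the corpus-based side of such a system (not the SimCompass implementation itself), the sketch below scores a pair of text units from different levels, e.g. a phrase against a sentence, by averaging their word embeddings and taking the cosine similarity of the resulting vectors. The embedding table, tokenisation, and function names are hypothetical placeholders chosen for the example.

# Illustrative sketch, assuming pre-trained word embeddings are available;
# the toy EMBEDDINGS table below stands in for vectors learned with a
# deep-learning model. This is not the authors' actual system.
import numpy as np

EMBEDDINGS = {
    "sentence": np.array([0.2, 0.7, 0.1]),
    "phrase":   np.array([0.3, 0.6, 0.2]),
    "similar":  np.array([0.1, 0.8, 0.3]),
}

def embed(text: str) -> np.ndarray:
    """Average the embeddings of the known tokens in a text unit."""
    vectors = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return np.zeros(3)
    return np.mean(vectors, axis=0)

def similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two text units of possibly different levels."""
    a, b = embed(text_a), embed(text_b)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

print(similarity("similar phrase", "sentence"))

Averaging token vectors is only the simplest way to compose a larger unit into a single vector; the paper pairs such corpus-based measures with knowledge-based metrics and context expansion inside a meta-learning framework.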
Similar Resources
Learning Effective Word Embedding using Morphological Word Similarity
Deep learning techniques aim at obtaining high-quality distributed representations of words, i.e., word embeddings, to address text mining and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context that captures both semantic and syntactic relationships between words. However, it is challenging to handle unseen words or rare words...
Learning Word Meta-Embeddings by Using Ensembles of Embedding Sets
Word embeddings – distributed representations of words – in deep learning are beneficial for many tasks in natural language processing (NLP). However, different embedding sets vary greatly in quality and characteristics of the captured semantics. Instead of relying on a more advanced algorithm for embedding learning, this paper proposes an ensemble approach of combining different public embeddi...
Deep Multilingual Correlation for Improved Word Embeddings
Word embeddings have been found useful for many NLP tasks, including part-of-speech tagging, named entity recognition, and parsing. Adding multilingual context when learning embeddings can improve their quality, for example via canonical correlation analysis (CCA) on embeddings from two languages. In this paper, we extend this idea to learn deep non-linear transformations of word embeddings of ...
Knowledge-Powered Deep Learning for Word Embedding
The basis of applying deep learning to solve natural language processing tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains incomplete and ambiguous information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, text itself already contains well-defined ...
HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity
In this paper, we introduce an approach to combining word embeddings and machine translation for multilingual semantic word similarity, Task 2 of SemEval-2017. Thanks to the unsupervised transliteration model, our cross-lingual word embeddings encounter far fewer out-of-vocabulary (OOV) words. Our results are produced using only monolingual Wikipedia corpora and a limited amount of sentence-aligned data. Alth...
Publication year: 2014